Average-case linear-time similar substring searching by the q-gram distance
Similar resources
Average-case linear-time similar substring searching by the q-gram distance
In this paper we consider the problem of similar substring searching in the q-gram distance. The q-gram distance dq(x, y) is a similarity measure between two strings x and y defined by the number of different q-grams between them. The distance can be used instead of the edit distance due to its lower computation cost, O(|x| + |y|) vs. O(|x||y|), and its good approximation of the edit distance....
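As a rough illustration of the q-gram distance described in this abstract, the sketch below assumes the commonly used definition in which dq(x, y) is the sum, over all q-grams, of the absolute difference between the number of occurrences in x and in y. The truncated abstract does not spell out the exact definition, so this is an assumption, and the function names `qgram_profile` and `qgram_distance` are hypothetical. It computes the distance between two whole strings only and is not the paper's substring search algorithm.

```python
from collections import Counter


def qgram_profile(s: str, q: int) -> Counter:
    """Count every length-q substring of s (empty if len(s) < q)."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def qgram_distance(x: str, y: str, q: int) -> int:
    """Sum of absolute differences of q-gram counts (assumed definition)."""
    px, py = qgram_profile(x, q), qgram_profile(y, q)
    # A Counter returns 0 for q-grams it has not seen, so the union of keys is safe.
    return sum(abs(px[g] - py[g]) for g in px.keys() | py.keys())


if __name__ == "__main__":
    print(qgram_distance("abab", "baba", 2))
```

For fixed q, building both profiles with a hash table takes expected time proportional to |x| + |y|, which matches the cost quoted in the abstract, whereas the standard dynamic program for edit distance fills a table of |x| by |y| cells.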
Towards Distance-Based Phylogenetic Inference in Average-Case Linear-Time
Computing genetic evolution distances among a set of taxa dominates the running time of many phylogenetic inference methods. Most genetic evolution distance definitions rely, even if indirectly, on computing the pairwise Hamming distance among sequences or profiles. We propose here an average-case linear-time algorithm to compute pairwise Hamming distances among a set of taxa under a given H...
Position-Restricted Substring Searching
A full-text index is a data structure built over a text string T[1, n]. The most basic functionality provided is (a) counting how many times a pattern string P[1, m] appears in T and (b) locating all those occ positions. There exist several indexes that solve (a) in O(m) time and (b) in O(occ) time. In this paper we propose two new queries, (c) counting how many times P[1, m] appears in T[l, ...
One-Gapped q-Gram Filters for Levenshtein Distance
We have recently shown that q-gram filters based on gapped q-grams instead of the usual contiguous q-grams can provide orders of magnitude faster and/or more efficient filtering for the Hamming distance. In this paper, we extend the results for the Levenshtein distance, which is more problematic for gapped q-grams because an insertion or deletion in a gap affects a q-gram while a replacement do...
Journal
Journal title: Theoretical Computer Science
Year: 2014
ISSN: 0304-3975
DOI: 10.1016/j.tcs.2014.02.022